A Transparent, Incremental, Concurrent Checkpoint Mechanism for Real-time and Interactive Applications
Abstract
This paper proposes, implements, and evaluates TIC-CKPT, a TLB-miss-based incremental, concurrent checkpoint mechanism for real-time and interactive applications. TIC-CKPT allows checkpoint setting to overlap with the execution of the checkpointed process. While the memory address space is being saved to non-volatile storage, TIC-CKPT tracks TLB misses to block the first access to each target memory page. A thread running in privileged mode then copies the target page to a designated memory buffer and resumes the blocked memory access. Finally, the original pages in the designated memory buffer are used to construct a consistent state of the checkpointed process. Experimental results show that, compared with a traditional concurrent checkpoint system, TIC-CKPT saves more than 2% of the checkpoint time and decreases the stopped time of the checkpointed process by around 10%. TIC-CKPT also supports concurrent incremental checkpointing. Compared with a conventional incremental checkpoint approach, it greatly reduces the downtime introduced by setting an incremental checkpoint, as long as the benchmarks exhibit the principle of locality.
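The copy-on-first-access idea in the abstract can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the TIC-CKPT implementation: the real mechanism intercepts accesses through TLB misses in privileged mode, whereas the hypothetical `SimulatedProcess` class below uses a plain Python set to mark pages that have not yet been saved, and it intercepts writes only. All names (`SimulatedProcess`, `begin_checkpoint`, `write`, `saver_step`) are invented for illustration.

```python
class SimulatedProcess:
    """Toy address space used to illustrate a concurrent checkpoint.

    Assumption: a Python set stands in for the TLB-miss tracking that the
    abstract describes; nothing here touches real page tables.
    """

    def __init__(self, num_pages, page_size=4):
        self.pages = [bytearray(page_size) for _ in range(num_pages)]
        self.protected = set()  # pages not yet saved in the current checkpoint
        self.buffer = {}        # page index -> original contents copied on first write

    def begin_checkpoint(self):
        # Mark every page so that its first write is intercepted.
        self.protected = set(range(len(self.pages)))
        self.buffer = {}

    def write(self, idx, offset, value):
        if idx in self.protected:
            # First write during the checkpoint: keep the original page,
            # then let the write proceed without waiting for the saver.
            self.buffer[idx] = bytes(self.pages[idx])
            self.protected.discard(idx)
        self.pages[idx][offset] = value

    def saver_step(self, idx, storage):
        # Background saver: persist the original contents of one page.
        if idx in self.buffer:
            storage[idx] = self.buffer.pop(idx)
        else:
            storage[idx] = bytes(self.pages[idx])
            self.protected.discard(idx)


def run_demo():
    proc = SimulatedProcess(num_pages=3)
    for i, page in enumerate(proc.pages):
        page[0] = i + 1
    expected = [bytes(p) for p in proc.pages]  # state at checkpoint time

    storage = {}
    proc.begin_checkpoint()
    proc.saver_step(0, storage)  # page 0 saved before it is modified
    proc.write(0, 0, 99)         # no copy needed, page 0 is already saved
    proc.write(2, 0, 77)         # page 2 not saved yet: original is buffered
    proc.saver_step(1, storage)
    proc.saver_step(2, storage)  # saver takes the buffered original

    snapshot = [storage[i] for i in range(3)]
    return snapshot, expected, proc


if __name__ == "__main__":
    snapshot, expected, _ = run_demo()
    print(snapshot == expected)
```

In this sketch the process keeps running while pages are saved, and the snapshot still reflects the state at the moment `begin_checkpoint` was called, which is the consistency property the abstract attributes to the buffered original pages.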
Similar resources
Transparent Checkpoint-Restart of Multiple Processes on Commodity Operating Systems
The ability to checkpoint a running application and restart it later can provide many useful benefits including fault recovery, advanced resources sharing, dynamic load balancing and improved service availability. However, applications often involve multiple processes which have dependencies through the operating system. We present a transparent mechanism for commodity operating systems that ca...
The Performance of Consistent Checkpointing
Consistent checkpointing provides transparent fault tolerance for long-running distributed applications. In this paper we describe performance measurements of an implementation of consistent checkpointing. Our measurements show that consistent checkpointing performs remarkably well. We executed eight compute-intensive distributed applications on a network of diskless Sun workstations comparin...
libhashckpt: Hash-Based Incremental Checkpointing Using GPU's
Concern is beginning to grow in the high-performance computing (HPC) community regarding the reliability guarantees of future large-scale systems. Disk-based coordinated checkpoint/restart has been the dominant fault tolerance mechanism in HPC systems for the last 30 years. Checkpoint performance is so fundamental to scalability that nearly all capability applications have custom checkpoint str...
Batch to Real-Time: Incremental Data Collection & Analytics Platform
Real-time data collection and analytics is a desirable but challenging feature to provide in data-intensive software systems. Providing highly concurrent and efficient real-time analytics on streaming data at interactive speeds requires a well-designed software architecture that makes use of a carefully selected set of software frameworks. In this paper, we report on the design and implementatio...
Accelerating incremental checkpointing for extreme-scale computing
Concern is beginning to grow in the high-performance computing (HPC) community regarding the reliability of future large-scale systems. Disk-based coordinated checkpoint/restart has been the dominant fault tolerance mechanism in HPC systems for the last 30 years. Checkpoint performance is so fundamental to scalability that nearly all capability applications have custom checkpoint strategies to ...
Journal title:
- J. Inf. Sci. Eng.
Volume 29, Issue
Pages -
Publication date 2013